Hierarchical Neural Networks for Partial Diagnosis in Medicine
Abstract
Various domains require hierarchical classification. In medicine, learning partial diagnoses can be helpful when time and information constraints are present. Hierarchical neural networks provide a good means to perform partial diagnosis. We implemented a hierarchical backpropagation-based model for the domain of thyroid diseases, and compared the results against those of nonhierarchical networks in terms of sensitivities and specificities. In our system, high-level neural networks filter instances that are relevant for use in specialized neural networks. The hierarchical model required fewer epochs to be trained and yielded a higher classification rate in the test set than did the nonhierarchical one. The hierarchical model also had the advantage that fewer data attributes for each instance were required at higher levels. Therefore, using this model reduces the problem of dealing with missing values, and provides a framework to establish a parsimonious sequence of tests for diagnosing thyroid diseases.

1. Background

In most real-life situations, medical decision making is done in the absence of complete information. Diagnostic tests may be ordered to decrease uncertainty, but actions take place before all results become available. The actions (which could be ordering new tests, or prescribing a treatment) may change the course of the disease. Cases that are resolved in this initial phase may never be assigned a final diagnosis. Conversely, further investigation may yield a more precise diagnosis. The diagnostic process is then repeated until no additional information is necessary. Yet the decisions made early in the diagnostic process, usually in the absence of complete information, play a key role in patient outcomes. These decisions are based on partial diagnoses derived from a limited set of observations. Partial diagnoses are key components in medical reasoning [Pople, 1982], usually consisting of syndromic, rather than etiologic, diagnoses.
Thyroid diseases are classified into two major classes: hypothyroidism and hyperthyroidism. Each of these classes can be further divided according to the etiology of the disease: hypothyroidism can be divided into primary, secondary, and so on. We have built a computer program to help physicians decide whether a patient has hypothyroidism, hyperthyroidism, or normal thyroid function by interpreting the results of the patient's laboratory tests, and defining a partial diagnosis. Such a partial diagnosis may be useful in explaining some of the patient's findings, in helping a clinician to decide what diagnostic tests to order next, and in helping the physician decide which medications may be appropriate (even though this partial diagnosis may not be sufficient to allow the clinician to decide on the optimal therapy). The system produces useful results early in the course of the investigation, when only scarce information is available. In cases where the system determines that the patient's thyroid function is not normal, further processing occurs, and a final diagnosis is suggested.

Many taxonomies of diseases (nosologies) are structured in a hierarchical fashion [Gara, Rosenberg, and Goldberg, 1992]. This type of classification is not only easier to understand than a flat list of diseases, but also provides a basis that guides the differential diagnosis. It is therefore natural to use a hierarchical classification system to perform medical diagnosis. Several authors have used this approach when building medical expert systems, or rule-based systems [Weiss, Kulikowski, Amarel, and Safir, 1978]. Although performance may be acceptable, problems with expert systems usually occur during the knowledge-acquisition phase, when a great amount of time is spent on extracting information from the expert [Forsythe and Buchanan, 1989].
Furthermore, expert judgment may contain biases [Tversky and Kahneman, 1974], a problem that machine-learning approaches, by extracting information from evidence, may also avoid.

Hierarchies of neural networks are not new. Ballard proposed them as a solution to the problem of building large networks [Ballard, 1990]. He developed a modification of the backpropagation algorithm to be applied to these hierarchies; he reported that, in preliminary computer experiments on large problems, his approach performed better in terms of time and accuracy than did the backpropagation algorithm with several internal levels. He did not report specific results of these studies.

Our motivation for using hierarchical networks was somewhat different. Although we were concerned with the scaling problem, the objective of our project was to develop a hierarchical network that would perform partial diagnosis accurately and parsimoniously. We applied the backpropagation algorithm to a sequence of networks, so that each network was trained in a supervised way. The first level of this hierarchical system was presented with fewer data attributes than were given to the more specialized level. The purpose of this first network was to establish a partial diagnosis of hypothyroidism, hyperthyroidism, other conditions, or no disease.

Curry and Rumelhart used hierarchical networks to classify mass spectra [Curry and Rumelhart, 1990]. Our system is based on their architecture, except that we did not incorporate extra units for representing the degree of confidence in the top-level network's results. Curry and Rumelhart did not compare the hierarchical system with its nonhierarchical counterpart. Other approaches using neural networks involve preprocessing of data by several statistical techniques, usually in a nonsupervised manner [Hrycej, 1992].
Frean has proposed a method for constructing the hierarchical networks dynamically, but concepts associated with each intermediate level did not have a specific meaning, as they do in our system [Frean, 1990]. Alternatives to building a supervised hierarchical classifier outside the field of neural connectionist systems include piecewise linear machines, as described by Nilsson [1965], and classification trees [Breiman, Friedman, Olshen, and Stone, 1984].

Hierarchical classification in medical domains has been done in a few cases. Ash and Hayes-Roth have studied the use of action-based hierarchies in a surgical intensive-care unit [Ash and Hayes-Roth, 1993], and there are rule-based systems that rely on hierarchical classification [Weiss, Kulikowski, Amarel, and Safir, 1978].

2. Material and Methods

We used the set of cases of thyroid diseases provided by Quinlan [1987], and distributed by the University of California at Irvine [Murphy and Aha, 1992]. It consists of more than 9000 instances, each with 29 attributes. A previous version of this database was used by Quinlan to show the implementation of decision trees [Quinlan, 1986]. There are continuous and discrete values, as well as many missing values. Input consists mainly of values for laboratory-test results. There are 20 classes for output, which can be grouped into at least four superclasses. Data were collected from 1984 to 1987 in an Australian medical institution.

Similar data were also used previously in a neural-network implementation [Schiffmann, Joost, and Werner, 1992]. The authors described the difficulty that the system had in learning the patterns. They tried different variations of backpropagation, and studied the variability of learning associated with variation in learning rate and momentum. As in Quinlan's experiments, their problem was just to classify whether or not the patient had hypothyroidism. The authors were not concerned with learning both partial and final diagnoses.
Weiss also used a similar set of data to compare different machine-learning algorithms in the domain of thyroid diseases, showing that the smaller error rates in the testing set were associated with neural networks of nine hidden units, trained by a backpropagation variant [Weiss and Kulikowski, 1990].

2.1. Multiple Neural-Network Architecture

Two top-level networks that determined partial diagnoses (triage neural networks) consisted of multilayered perceptrons (MLPs), with inputs provided by the reduced set of data attributes (20 inputs in the case of the first partial networks), or the complete set of data attributes (23 inputs in the case of the other networks). We varied the number of input attributes to measure the importance of the three additional attributes to the determination of the partial diagnosis. The attributes were laboratory values that could be left out in the first clinical assessment of thyroid diseases (T3, T4, and TBG). Figure 1 shows the architecture of the triage networks, and Figure 2 shows the architecture of the specialized network. The complete set of data (23 inputs) was presented to the generic network, in which the final diagnoses corresponded to output units. Figure 3 shows the architecture of the generic network.

FIGURE 1. Triage network. Inputs are clinical and laboratory data; outputs are first partial diagnoses. TSH is thyroid-stimulating hormone, T4U is thyroxine resin uptake, T3 is triiodothyronine, TT4 is total thyroxine, and TBG is thyroxine-binding globulin. (Figure labels: hidden layer; patient data.)
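
The flow from the triage networks to the specialized network can be illustrated with a minimal Python sketch. It shows only the routing logic: a triage classifier sees the reduced 20-attribute view and returns a partial diagnosis, and a specialized classifier, if one exists for that partial diagnosis, sees the full 23-attribute record. Several things here are assumptions made for illustration and are not taken from the paper: the function and variable names, the idea that the reduced view is the first 20 positions of the record, and the toy stand-in classifiers, which are arbitrary threshold rules and not trained networks or clinical criteria.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Input sizes stated in the text: reduced view and complete view.
N_TRIAGE_INPUTS = 20
N_FULL_INPUTS = 23

# A classifier maps a feature vector to a diagnosis label.
# In the paper these are trained MLPs; here any callable will do.
Classifier = Callable[[Sequence[float]], str]


def hierarchical_diagnosis(
    full_record: Sequence[float],
    triage: Classifier,
    specialists: Dict[str, Classifier],
) -> Tuple[str, Optional[str]]:
    """Return (partial diagnosis, final diagnosis or None)."""
    if len(full_record) != N_FULL_INPUTS:
        raise ValueError(f"expected {N_FULL_INPUTS} attributes")
    # Assumption: the reduced view is simply the first 20 attributes,
    # with the three extra laboratory values stored last.
    partial = triage(full_record[:N_TRIAGE_INPUTS])
    # Only instances routed to a specialist are processed further.
    specialist = specialists.get(partial)
    final = specialist(full_record) if specialist is not None else None
    return partial, final


# Hypothetical stand-ins: arbitrary rules, NOT clinical thresholds.
def toy_triage(reduced: Sequence[float]) -> str:
    return "hypothyroidism" if reduced[0] > 0.5 else "no disease"


def toy_hypothyroid_specialist(full: Sequence[float]) -> str:
    # Uses one of the three attributes hidden from the triage level.
    if full[22] > 0.5:
        return "primary hypothyroidism"
    return "secondary hypothyroidism"


if __name__ == "__main__":
    specialists = {"hypothyroidism": toy_hypothyroid_specialist}
    abnormal = [0.9] + [0.0] * 21 + [0.8]
    normal = [0.1] + [0.0] * 22
    print(hierarchical_diagnosis(abnormal, toy_triage, specialists))
    print(hierarchical_diagnosis(normal, toy_triage, specialists))
```

In this sketch a record that the triage level labels as normal never reaches a specialist, so the three additional attributes are not consulted for it, which mirrors the point that fewer attributes are required at higher levels.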
Publication date: 1994